This file contains the R code and data to produce the final manuscript results. 
Two data downloads are required to run the code -- instructions are provided below.

CONTENTS -- Results_With_Emissions contains two sub-directories:
1) Code : The R code to generate results -- including Figures 1-4 and S1-S10 and supplemental datasets S1-S5 in Choma, E.F., Evans, J. S., Gomez-Ibanez, J. A., Di, Q., Schwartz, J., Hammitt, J. K., Spengler, J.D. (2021). Health benefits of decreases in on-road transportation emissions in the United States from 2008 to 2017. Accepted for publication at Proceedings of the National Academy of Sciences of the United States of America.

2) Inputs: The model inputs used in the code. 
a) Marginal Damages Model Results (provided with the code)
b) NEI Emissions data, already processed (provided with the code)
c) Auxiliary Population data (provided with the code)
d) County List (provided with the code)
e) Baseline Ambient PM2.5 levels (provided with the code)
f) County data from the U.S. Census Bureau (needs to be downloaded)
g) Metropolitan Area data from the U.S. Census Bureau (needs to be downloaded)
h) U.S. Census Bureau Cartographic Boundary File for U.S. counties at 1:500,000 resolution level (needs to be downloaded)
Detailed descriptions of the model inputs, and download locations for f), g), and h), are provided below.


------------


DESCRIPTION OF INPUTS: INPUTS THAT NEED TO BE DOWNLOADED

f) County data from the U.S. Census Bureau
Data Source: U.S. Census Bureau, 2020
Dataset name: 'co-est2019-alldata.csv'
Available at: https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/counties/totals/co-est2019-alldata.csv
Accessed: May 03, 2021
Data Last Modified, according to US Census Bureau: 03/26/2020 09:40
File Size: 3.5M

g) Metropolitan Area data from the U.S. Census Bureau
Data Source: U.S. Census Bureau, 2020 
Dataset name: 'cbsa-est2019-alldata.csv'
Available at: https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/metro/totals/cbsa-est2019-alldata.csv
Accessed: May 03, 2021
Data Last Modified, according to US Census Bureau: 03/26/2020 09:41
File Size: 1.2M

h) U.S. Census Bureau Cartographic Boundary File for U.S. counties at 1:500,000 resolution level (required to plot Figure 4 only)
Data Source: U.S. Census Bureau
Available at: https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_county_500k.zip
Accessed: May 04, 2021
Data Last Modified, according to US Census Bureau: 05/01/2020 13:27
File Size: 11M
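
The three downloads above could be scripted in R roughly as follows. This is a minimal sketch: the URLs are the ones listed in this README, but the destination folder name ("Inputs"), the helper name download_inputs, and the choice to unzip the boundary file in place are assumptions, so check the paths the code actually reads from before relying on it.

```r
# URLs copied from the descriptions of inputs f), g), and h) above.
census_files <- c(
  county = "https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/counties/totals/co-est2019-alldata.csv",
  cbsa   = "https://www2.census.gov/programs-surveys/popest/datasets/2010-2019/metro/totals/cbsa-est2019-alldata.csv",
  shape  = "https://www2.census.gov/geo/tiger/GENZ2019/shp/cb_2019_us_county_500k.zip"
)

# Hypothetical helper: download each file once into dest_dir (assumed folder name).
download_inputs <- function(dest_dir = "Inputs") {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  dest <- file.path(dest_dir, basename(census_files))
  for (i in seq_along(census_files)) {
    if (!file.exists(dest[i])) {
      download.file(census_files[[i]], destfile = dest[i], mode = "wb")
    }
  }
  # The cartographic boundary file is a zip archive; extract it (assumed layout).
  unzip(dest[names(census_files) == "shape"], exdir = dest_dir)
  invisible(dest)
}

# Usage (requires network access):
# download_inputs("Inputs")
```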


------------


DESCRIPTION OF INPUTS: INPUTS PROVIDED WITH THE MODEL

a) Marginal Damages Model Results (provided with the code): (a.i) Marginal Values, and (a.ii) Source-Receptor Matrix (SRM) for 2017
These results are provided with the code, as inputs in this Results_With_Emissions file.
They are also outputs of the Marginal Damages Model and are provided as outputs in the Marginal_Damages_Model file.
a.i) Marginal Values: 
Four .RData files. Filenames are: MV_Damages_BasePM_[PMYear]_Mortality_[MYear].RData
Where:
PMYear is the year of baseline ambient PM2.5 concentrations data, and can be 2008 or 2017
MYear is the year of baseline mortality data, and can be 2008 or 2017

Once loaded in R, each .RData file contains 4 "datasets"/"variables", one for each concentration-response function (CRF).
These "datasets" are in fact matrices of dimension 3,108 x 5 where:
The rows are sources (3,108 counties);
The 5 columns are one for each pollutant. In order: Primary PM2.5, SO2, NOX, NH3, and VOC

These matrices/variables are named:
Marg.Damages.[CRF]_BasePM_[PMYear]_Mortality_[MYear]

Where:
CRF is one of GEMM, Vodonos.Parametric, Vodonos.Spline, or Krewski
PMYear is the year of baseline ambient PM2.5 concentrations data, and can be 2008 or 2017
MYear is the year of baseline mortality data, and can be 2008 or 2017
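
The naming scheme above can be turned into a small loader. The following is a sketch only: the file and object name patterns are taken from this README, while the helper names (mv_file_name, mv_object_name, load_marginal_values) and the input folder passed to them are hypothetical.

```r
crfs <- c("GEMM", "Vodonos.Parametric", "Vodonos.Spline", "Krewski")

# Build the .RData file name for a given pair of years.
mv_file_name <- function(pm_year, m_year) {
  sprintf("MV_Damages_BasePM_%d_Mortality_%d.RData", pm_year, m_year)
}

# Build the name of the matrix stored inside that file for one CRF.
mv_object_name <- function(crf, pm_year, m_year) {
  sprintf("Marg.Damages.%s_BasePM_%d_Mortality_%d", crf, pm_year, m_year)
}

# Hypothetical helper: load one file into a private environment and return
# the four matrices as a list named by CRF.
load_marginal_values <- function(input_dir, pm_year, m_year) {
  env <- new.env()
  load(file.path(input_dir, mv_file_name(pm_year, m_year)), envir = env)
  mv <- mget(mv_object_name(crfs, pm_year, m_year), envir = env)
  names(mv) <- crfs
  # Each object should be 3,108 source counties x 5 pollutants.
  for (m in mv) stopifnot(identical(dim(m), c(3108L, 5L)))
  mv
}

# Usage (the folder name is an assumption):
# mv <- load_marginal_values("Inputs", 2017, 2017)
# mv$GEMM[1, ]   # Primary PM2.5, SO2, NOX, NH3, VOC for the first county
```

Loading into a separate environment keeps the long object names out of the global workspace and makes it easy to hold several year combinations at once.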

a.ii) Source-Receptor Matrix for 2017
Filename: SRM_Damages_BasePM_2017_Mortality_2017.RData
Only the SRM with baseline ambient PM2.5 concentrations for 2017 and mortality data for 2017 is used.

Once loaded in R, this .RData file contains 4 "datasets"/"variables", one for each CRF. 
These "datasets" are in fact arrays of dimension 3,108 x 3,108 x 5 where:
The rows are sources (3,108 counties);
The columns are receptors (3,108 counties); and
The 5 slices are one for each pollutant. In order: Primary PM2.5, SO2, NOX, NH3, and VOC

These arrays/variables are named:
Marg.Damages.[CRF]_BasePM_[PMYear]_Mortality_[MYear]

Where:
CRF is one of GEMM, Vodonos.Parametric, Vodonos.Spline, or Krewski
PMYear is the year of baseline ambient PM2.5 concentrations data, and can be 2008 or 2017
MYear is the year of baseline mortality data, and can be 2008 or 2017
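
To illustrate the source x receptor x pollutant layout, here is a toy array with 4 counties instead of 3,108. The values are random and purely illustrative. Collapsing the receptor dimension gives a source x pollutant matrix with the same shape as the Marginal Values matrices; whether those sums reproduce the Marginal Values is not stated in this README and is not assumed here.

```r
# Toy stand-in for one SRM array (random numbers, not real damages).
n <- 4
pollutants <- c("Primary PM2.5", "SO2", "NOX", "NH3", "VOC")
set.seed(1)
srm <- array(
  runif(n * n * length(pollutants)),
  dim = c(n, n, length(pollutants)),
  dimnames = list(source = NULL, receptor = NULL, pollutant = pollutants)
)

# Entry for NOX emitted in source county 2 and received in county 3.
srm[2, 3, "NOX"]

# All receptor counties for SO2 emitted in source county 1 (length n).
srm[1, , "SO2"]

# Sum over receptors (dimension 2): one value per source and pollutant.
by_source <- apply(srm, c(1, 3), sum)
dim(by_source)
```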


b) NEI Emissions data, already processed (provided with the code)
Source: U.S. Environmental Protection Agency (2018a; 2018b; 2020a; 2020b)
These emissions are provided with the code, as inputs in this Results_With_Emissions file.
They are also outputs of the preprocessing of model input data and are provided as outputs in the Preprocessing_of_Model_Input_Data file.
Four files are provided: NEIxxxx.RData, where xxxx denotes the NEI year (2008, 2011, 2014, or 2017)

Each .RData file contains the following datasets:
b.i) NEIxxxx.VTYPE.EF (array of 3,108 counties x 11 columns x 14 vehicle types)
Emission factors (in g/mile) for each of 13 vehicle types and for all vehicles combined
Each row (dimension 1) denotes a county
Each slice (dimension 3) denotes a vehicle type
The columns (dimension 2) are:
1: State & County FIPS code
2: Vehicle type
3: VMT (in miles)
4: Primary PM 2.5 emission factor (g/mile)
5: SO2 emission factor (g/mile)
6: NOx emission factor (g/mile)
7: NH3 emission factor (g/mile)
8: VOC emission factor (g/mile)
9: CO2 emission factor (g/mile)
10: CH4 emission factor (g/mile)
11: N2O emission factor (g/mile)

The slices (dimension 3) are EPA's Vehicle Types/Source Types as used in NEI 2017 (slices 1-13) and all vehicles combined (slice 14):
1: Motorcycle (VTYPE = 11)
2: Passenger Car (VTYPE = 21)
3: Passenger Truck (VTYPE = 31)
4: Light Commercial Truck (VTYPE = 32)
5: Intercity Bus (VTYPE = 41)
6: Transit Bus (VTYPE = 42)
7: School Bus (VTYPE = 43)
8: Refuse Truck (VTYPE = 51)
9: Single Unit Short-Haul Truck (VTYPE = 52)
10: Single Unit Long-Haul Truck (VTYPE = 53)
11: Motor Home (VTYPE = 54)
12: Combination Short-Haul Truck (VTYPE = 61)
13: Combination Long-Haul Truck (VTYPE = 62)
14: All vehicles combined (VTYPE = 99 in the dataset, although this is not used by EPA)
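
The slice numbering above can be wrapped in a lookup so that code refers to VTYPE codes rather than slice positions. This sketch uses a toy array and follows the dimension numbering given in the description (rows = dimension 1, columns = dimension 2, slices = dimension 3); the short column labels and the helper name vtype_slice are hypothetical, and it is worth confirming the layout with dim() on the real object.

```r
# VTYPE codes in slice order, as listed above.
vtype_codes <- c(11, 21, 31, 32, 41, 42, 43, 51, 52, 53, 54, 61, 62, 99)

# Hypothetical short labels for the 11 columns, in the documented order.
ef_columns <- c("FIPS", "VTYPE", "VMT", "PM25", "SO2", "NOX",
                "NH3", "VOC", "CO2", "CH4", "N2O")

# Map a VTYPE code to its slice index (1-14).
vtype_slice <- function(vtype) {
  idx <- match(vtype, vtype_codes)
  if (anyNA(idx)) stop("Unknown VTYPE code")
  idx
}

# Toy array: 3 counties x 11 columns x 14 slices, filled with zeros.
ef <- array(0, dim = c(3, length(ef_columns), length(vtype_codes)))
ef[, match("VTYPE", ef_columns), ] <- rep(vtype_codes, each = 3)

# NOx emission factors for passenger cars (VTYPE = 21) in every county.
nox_passenger_cars <- ef[, match("NOX", ef_columns), vtype_slice(21)]
```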

b.ii) NEIxxxx.VTYPE.EMIS
This is similar to b.i), except that columns 4-11 contain total emissions (in short tons) instead of emission factors

b.iii) NEIxxxx.REF.EMIS
These are VOC refueling emissions, which are not included with any of the 13 vehicle types in datasets b.i) and b.ii) but are included in the emissions of all vehicles combined in those datasets (slice 14).
Note that this dataset is not provided for 2008, since that NEI does not list refueling emissions separately.
These datasets are 3,108 x 5 data frames.
The columns are:
1: State & County FIPS code
2: Vehicle Type (which we use as 0 for refueling emissions)
3: VMT for all vehicles combined in each county
4: VOC emissions (in short tons)
5: VOC emission factors (in g/mile, where mile is the sum of VMT for all vehicles in each county) 

We indicate VTYPE = 0 for the refueling datasets.
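
Emissions in short tons and emission factors in g/mile are linked through VMT. The sketch below shows that arithmetic using the exact definition 1 short ton = 2,000 lb = 907,184.74 g. The function names are hypothetical, and the README does not state which conversion constant the preprocessing code used, so small rounding differences against the provided datasets are possible.

```r
# Grams per short ton (2,000 lb x 453.59237 g/lb).
G_PER_SHORT_TON <- 907184.74

# Emission factor (g/mile) from total emissions (short tons) and VMT (miles).
emis_to_ef <- function(emis_short_tons, vmt_miles) {
  emis_short_tons * G_PER_SHORT_TON / vmt_miles
}

# Total emissions (short tons) from an emission factor (g/mile) and VMT (miles).
ef_to_emis <- function(ef_g_per_mile, vmt_miles) {
  ef_g_per_mile * vmt_miles / G_PER_SHORT_TON
}

# Example with made-up numbers: 10 short tons of VOC over 50 million miles
# gives roughly 0.18 g/mile.
emis_to_ef(10, 5e7)
```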


c) Auxiliary Population data (provided with the code) from CDC/HHS
Source: U.S. Department of Health and Human Services (2020)
These data are provided as a dataset of 161,642 rows x 5 columns
Columns are: 
1: FIPS State + County Code -- 3,108 counties for 2017 and 2014-2018; 3,109 counties for 2008 and 2006-2010
2: Age Group -- eleven 5-year age groups from 25-29 to 75-79; 80 and older; all ages
3: Age Group Code -- 0 for all ages, 1-11 for the five-year age groups, 12 for 80+
4: Population -- Population Counts
5: Year -- 4 'years' : 2008, 2017, 2006-2010 (5-year count), 2014-2018 (5-year count)
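
As an illustration of how the age group codes can be used, the sketch below sums the population aged 25 and older by county for one year. The data frame is a tiny invented example, and the column names (FIPS, AgeGroup, AgeGroupCode, Population, Year) are hypothetical labels for the five columns described above.

```r
# Invented rows that mimic the documented structure (not real counts).
pop <- data.frame(
  FIPS         = c("01001", "01001", "01001", "01003"),
  AgeGroup     = c("All ages", "25-29", "80 and older", "All ages"),
  AgeGroupCode = c(0, 1, 12, 0),
  Population   = c(55000, 3500, 2000, 210000),
  Year         = c("2017", "2017", "2017", "2017"),
  stringsAsFactors = FALSE
)

# Keep the age-specific rows (codes 1-12) for 2017; code 0 is the all-ages
# total and would double count if it were included in the sum.
adults_2017 <- subset(pop, Year == "2017" & AgeGroupCode %in% 1:12)

# Population aged 25 and older per county.
total_25plus <- aggregate(Population ~ FIPS, data = adults_2017, FUN = sum)
```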


d) County List (provided with the code)
List of all 3,108 FIPS State + County codes


e) Baseline Ambient PM2.5 levels (provided with the code)

The columns are:
1: STCOU: FIPS State + County Code
2: Estimate of baseline ambient PM2.5 levels in 2008 (county annual average) [ug/m3], given by variable ‘Ambient_PM25_2008’
3: Estimate of baseline ambient PM2.5 levels in 2017 (county annual average) [ug/m3], given by variable ‘Ambient_PM25_2017’

This dataset contains 3 columns x 3,108 rows. Each row represents a county. 
Data source: these ambient levels were generated from the 1-km resolution estimates by Di et al. (2019). Because Di et al. (2019) provide estimates only through 2016, we used 2016 as a proxy for 2017 concentrations. We assigned a concentration to each Census block as the inverse-distance-weighted average of the nearest 4 cell centroids from Di et al. (2019). We then aggregated the Census blocks to counties, weighting by block population from the 2010 Decennial Census (U.S. Census Bureau, 2011).
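
The two steps described above (inverse-distance weighting of the nearest 4 cell centroids, then population-weighted aggregation to counties) can be sketched as follows. This is not the code used to produce the dataset: the function names are hypothetical, distances are plain Euclidean distances on toy coordinates, and weights are 1/distance, all of which are assumptions since the README does not specify the distance metric or weighting power.

```r
# Inverse-distance-weighted average of the k nearest grid cells per block.
# block_xy: n_blocks x 2 matrix; cell_xy: n_cells x 2 matrix;
# cell_pm: PM2.5 value for each grid cell.
idw_nearest4 <- function(block_xy, cell_xy, cell_pm, k = 4) {
  apply(block_xy, 1, function(p) {
    d <- sqrt((cell_xy[, 1] - p[1])^2 + (cell_xy[, 2] - p[2])^2)
    nearest <- order(d)[seq_len(min(k, length(d)))]
    # A block sitting exactly on a centroid takes that cell's value.
    if (d[nearest[1]] == 0) return(cell_pm[nearest[1]])
    w <- 1 / d[nearest]
    sum(w * cell_pm[nearest]) / sum(w)
  })
}

# Population-weighted county average of block-level concentrations.
county_pm <- function(block_pm, block_pop, block_county) {
  num <- tapply(block_pm * block_pop, block_county, sum)
  den <- tapply(block_pop, block_county, sum)
  num / den
}

# Toy example: five grid cells and two blocks in one county.
cell_xy  <- rbind(c(0, 0), c(1, 0), c(0, 1), c(1, 1), c(5, 5))
cell_pm  <- c(8, 10, 12, 14, 100)
block_xy <- rbind(c(0.5, 0.5), c(0, 0))

block_pm <- idw_nearest4(block_xy, cell_xy, cell_pm)
county   <- county_pm(block_pm, block_pop = c(100, 300),
                      block_county = c("01001", "01001"))
```

In the toy example the first block is equidistant from four cells, so its value is their simple mean, and the distant fifth cell is ignored because only the nearest four are used.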




------------


REFERENCES:

Emissions:
NEI 2008: 
[dataset] U.S. Environmental Protection Agency, 2018a. Data from “2008 National Emissions Inventory (NEI)”, 2008nei_v3.
NEI 2011: 
[dataset] U.S. Environmental Protection Agency, 2018b. Data from “2011 National Emissions Inventory (NEI)”, 2011nei_v2. 
NEI 2014: 
[dataset] U.S. Environmental Protection Agency, 2020a. Data from “2014 National Emissions Inventory (NEI)”, 2014nei_v2. 
NEI 2017: 
[dataset] U.S. Environmental Protection Agency, 2020b. Data from “2017 National Emissions Inventory (NEI)”, 2017nei_v1 (Apr 2020). 

Population Counts:
[dataset] U.S. Department of Health and Human Services, 2020. United States Department of Health and Human Services (US DHHS), Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS), Bridged-Race Population Estimates, United States July 1st resident population by state, county, age, sex, bridged-race, and Hispanic origin. Compiled from 1990-1999 bridged-race intercensal population estimates (released by NCHS on 7/26/2004); revised bridged-race 2000-2009 intercensal population estimates (released by NCHS on 10/26/2012); and bridged-race Vintage 2019 (2010-2019) postcensal population estimates (released by NCHS on 7/9/2020). Available on CDC WONDER Online Database. http://wonder.cdc.gov/bridged-race-v2019.html (accessed 30 November 2020).

Ambient PM2.5 Levels:
Di, Q. et al. (2019). An ensemble-based model of PM2.5 concentration across the contiguous United States with high spatiotemporal resolution. Environment International 130, 104909. https://doi.org/10.1016/j.envint.2019.104909.
[dataset] U.S. Census Bureau. (2011). Data from “2010 TIGER/Line Shapefiles with 2010 Census block geography and the 2010 Census population and housing unit counts.” https://www2.census.gov/geo/tiger/TIGER2010BLKPOPHU/ (accessed 11 September 2020).




